Generalized Gamma-CUSUM control chart with application of COVID-19 deaths

The increase in the number of infections and the worrisome state of mortality linked to the COVID-19 pandemic demand an optimal statistical model and an efficient monitoring scheme to analyze the deaths. This paper models COVID-19 mortality in Nigeria using four non-normal distributions grouped under the generalized gamma distribution and identifies the best-fit distribution for the number of deaths linked to the pandemic. In addition, a control chart to monitor the COVID-19 deaths based on the best-fit distribution is proposed. The performance of the proposed Gamma-CUSUM chart as a monitoring scheme was compared with the standard normal-CUSUM chart. The results revealed that the Gamma-CUSUM chart first signals a change in the number of deaths on day 68, whereas the standard normal-CUSUM chart signals no change. Also, the exact point of change was visible on the Gamma-CUSUM chart, which was not possible with the standard normal-CUSUM control chart.


Introduction
The novel coronavirus disease (COVID-19), caused by the SARS-CoV-2 virus [1], is a highly contagious disease first detected in Wuhan, China, in December 2019 [2]. It is one of the rare diseases that has posed a major threat to humanity and has negatively impacted the economies of many countries. The virus spreads primarily from person to person through respiratory droplets [3]. Symptoms of the disease may include fever, cough, shortness of breath, chills, muscle pain, and loss of taste or smell, which may appear 2-14 days after exposure [1,4]. The infection rate and the number of deaths due to the COVID-19 pandemic across the globe are high, and a new strain of the virus was discovered recently. However, to mitigate the effect of the pandemic, considerable research effort has led to the development of vaccines that have been approved for use in reducing the impact of COVID-19 infections and deaths.
To date, COVID-19 cases have been confirmed in over 188 countries across different continents, including Nigeria in Africa [5]. Globally, the number of confirmed cases as of 7th October 2021 is 236 132 082 and the number of deaths due to COVID-19 is 4 822 472. In Nigeria, the index case of COVID-19 was detected on 27th February 2020 [6]. Since then, there has been a gradual rise in the number of confirmed cases, recoveries, and deaths. Daily records of confirmed cases, recoveries, and deaths, including cumulative figures, are provided by the Nigeria Centre for Disease Control (NCDC). As of 7th October 2021, the total numbers of confirmed COVID-19 cases, discharged cases, active cases, and deaths in Nigeria as given by NCDC are 206920, 194651, 9471, and 2740, respectively [6].
To detect the presence of the virus, tests are conducted on individuals; an infected person is isolated and quarantined to prevent the spread of the virus. However, there is apprehension about the total number of confirmed cases in Nigeria due to low testing. Recently, there has been an improvement in the number of COVID-19 testing centers and an increase in testing capacity across the states in Nigeria due to the support of international donor agencies and the Federal government. Also, the government has issued non-pharmaceutical guidelines to mitigate the spread of the virus through social distancing rules, isolation and quarantine, use of nose masks, hand washing, and smaller gatherings of people. These measures had a positive effect in controlling the outbreak, but at a substantial economic and social cost during the lockdown [7]. Yet they have not prevented deaths among infected and vulnerable people.
Since the outbreak, many studies have been undertaken to estimate the growth rates and understand the transmission dynamics of COVID-19. Zhao et al. [8] estimated the growth rate of COVID-19 infection in China using the exponential growth model. Kucharski et al. [9] investigated the transmission dynamics of COVID-19 infection using a mathematical model to assess the effectiveness of several control measures. Chen et al. [10] proposed a mathematical model for estimating the transmissibility of the coronavirus and showed a higher transmissibility for COVID-19 than for some other viruses. Statistical analysis and monitoring of COVID-19 data were also explored by some researchers using sampling plans under neutrosophic statistics. Aslam et al. [11] proposed a gamma control chart based on generalized multiple dependent state sampling using COVID-19 mortality data. Sherwani et al. [12,13] examined the performance of the sign and Kruskal-Wallis tests under indeterminacy with applications to the COVID-19 reproduction rate and COVID-positive daily ICU occupancy. Rao et al. [14] proposed a time-truncated sampling plan using COVID-19 data for the Weibull distribution under indeterminacy. For more details on other studies of the COVID-19 pandemic, interested readers are referred to [15-21].
Several control charts have been proposed in the literature for the detection of an abnormal process. These control charts are used for detecting large and small-to-moderate shifts in the process variable. Notable among them are the Shewhart chart [22], the classical exponentially weighted moving average (EWMA) chart [23], and the cumulative sum (CUSUM) chart [24]. While the Shewhart chart is efficient in detecting large shifts, the classical EWMA and CUSUM charts detect small-to-moderate shifts efficiently. New monitoring schemes such as HWMA charts, mixed control charts, and progressive mean charts, which are extensions and modifications of the EWMA and CUSUM charts, have been studied in the literature, but the efficiency of such monitoring schemes has been criticized by Knoth et al. [25].
Usually, the assumption in the monitoring of process shifts is that the process variable follows the normal distribution. However, this is often not the case in practice, because the process variable may follow a non-normal distribution. Hence, monitoring the COVID-19 deaths using control charts to detect abnormal/unnatural variation requires identifying an appropriate statistical distribution for the number of COVID-19 deaths. It should be noted that analysis of COVID-19 data is a good indicator and a veritable means of detecting the worrisome state of the effect of the virus in Nigeria and across the globe.
The generalized gamma distribution has been studied by several authors, including Agarwal and Al-Saleh [26], who applied it to study hazard rates. Nadarajah and Gupta [27] introduced a generalized gamma distribution with application to fitting drought data. Cordeiro et al. [28] studied the generalized gamma distribution using an exponentiated method and applied it to lifetime and survival analysis. The number of deaths is the focus of this paper because loss of life challenges the attainment of one of the sustainable development goals (SDGs). Therefore, an appropriate statistical model for the COVID-19 data based on the generalized gamma distribution is desirable for identifying the best-fit parametric model for monitoring the number of deaths.
Hence, this study aims to model the distribution of the number of deaths due to COVID-19 by identifying the appropriate (best-fit) parametric model within the generalized gamma distribution. The goal is to discover important patterns in the COVID-19 data over the period under consideration, which will give the Federal government firsthand information for curtailing the pandemic in Nigeria. The best-fit mortality distribution will reflect mortality from the different age groups of infected persons in the different states. As a follow-up, the study monitors the COVID-19 deaths for abnormal patterns using a CUSUM chart based on the best-fit distribution, unlike the standard CUSUM chart, which assumes that the process data are normally distributed. This prevents the false conclusions that could be drawn from the standard (normal) CUSUM chart. In the subsequent sections, the probability density function (PDF) and cumulative distribution function (CDF) of the generalized gamma distribution, the description of the COVID-19 data, the procedure for obtaining the best-fit distribution, the design of the Gamma-CUSUM chart, and its application to COVID-19 data are discussed.

Generalized Gamma Distribution (GGD)
The three-parameter generalized gamma distribution, first introduced by Stacy [29], is a generalization of the two-parameter gamma distribution. The GGD under consideration is an extremely flexible distribution for data modeling, with PDF and CDF of the form, respectively,
$$f(x; \alpha, \beta, \lambda) = \frac{\beta}{\lambda\,\Gamma(\alpha)} \left(\frac{x}{\lambda}\right)^{\alpha\beta - 1} \exp\left[-\left(\frac{x}{\lambda}\right)^{\beta}\right], \quad x > 0, \tag{1}$$
and
$$F(x; \alpha, \beta, \lambda) = \frac{\gamma\left(\alpha, (x/\lambda)^{\beta}\right)}{\Gamma(\alpha)},$$
where $\lambda$ is the scale parameter, $\alpha$ and $\beta$ are the shape parameters, $\Gamma(\cdot)$ is the gamma function, and $\gamma(\cdot,\cdot)$ is the incomplete gamma function. The distribution has the exponential, Weibull, gamma, and lognormal distributions as special cases, and it is often used to identify which parametric model fits a given set of data. The distribution in Eq (1) becomes the exponential distribution if $\alpha = \beta = 1$, the gamma distribution if $\beta = 1$, the Weibull distribution if $\alpha = 1$, and the lognormal distribution as $\alpha \to \infty$.
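The special cases listed above can be checked numerically. The following is a minimal Python sketch, assuming Stacy's parameterization of the density with scale λ and shapes α and β, that is, f(x) = β / (λ Γ(α)) · (x/λ)^(αβ−1) · exp(−(x/λ)^β). The function names and the evaluation point are illustrative choices, not part of the original analysis.

```python
import math


def gg_pdf(x, alpha, beta, lam):
    """Generalized gamma density (assumed Stacy form): scale lam, shapes alpha and beta."""
    if x <= 0:
        return 0.0
    z = x / lam
    return beta / (lam * math.gamma(alpha)) * z ** (alpha * beta - 1) * math.exp(-(z ** beta))


def exponential_pdf(x, lam):
    """Exponential density with scale lam."""
    return math.exp(-x / lam) / lam


def gamma_pdf(x, alpha, lam):
    """Two-parameter gamma density with shape alpha and scale lam."""
    return x ** (alpha - 1) * math.exp(-x / lam) / (math.gamma(alpha) * lam ** alpha)


def weibull_pdf(x, beta, lam):
    """Weibull density with shape beta and scale lam."""
    return (beta / lam) * (x / lam) ** (beta - 1) * math.exp(-((x / lam) ** beta))


x, lam = 2.5, 3.0  # arbitrary evaluation point and scale, for illustration only
print(gg_pdf(x, 1.0, 1.0, lam), exponential_pdf(x, lam))   # alpha = beta = 1
print(gg_pdf(x, 2.0, 1.0, lam), gamma_pdf(x, 2.0, lam))    # beta = 1
print(gg_pdf(x, 1.0, 1.5, lam), weibull_pdf(x, 1.5, lam))  # alpha = 1
```

Each printed pair should agree up to floating-point error. The lognormal case is a limit rather than a parameter setting, so it is not checked here.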

The data
The data used in this study were obtained from the Nigeria Centre for Disease Control (NCDC) website (https://ncdc.gov.org). The data consist of daily reported COVID-19 confirmed cases, recoveries, active cases, and the number of deaths between 11th April and 7th September 2020. As of 7th September 2020, the cumulative numbers of COVID-19 confirmed cases, active cases, and deaths are 55160, 10868, and 1061, respectively. Our interest in this paper is the daily reported number of COVID-19 deaths within the period of study. Exploratory analysis and graphical tests of normality were carried out on the reported number of COVID-19 deaths. Also, the best-fit distribution for the COVID-19 data was determined using the generalized gamma distribution, and the number of COVID-19 deaths was monitored using the CUSUM control chart. The exploratory analysis of the reported COVID-19 deaths shows that the coefficient of skewness is 1.1471, the coefficient of kurtosis is 4.908, and the mean is greater than the median. This implies that the data are positively skewed. A plot of the number of COVID-19 deaths in Nigeria within the period of study is presented in Fig 1, and a further graphical presentation using a QQ plot and a density plot is presented in Fig 2. The plots in Fig 2 confirm that the distribution of the data is positively skewed and heavy-tailed. Thus, a CUSUM chart based on the normal distribution is not appropriate for the data. Hence, there is a need to search for an appropriate distribution for the data.
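The skewness and kurtosis diagnostics described above can be sketched as follows. This is a minimal Python illustration using population-moment estimators; the estimator actually used in the study is not stated, and the `toy_deaths` values are hypothetical numbers chosen only to show a right-skewed sample, not NCDC data.

```python
import statistics


def shape_summary(xs):
    """Return (mean, median, skewness, kurtosis) using population central moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    return mean, statistics.median(xs), skewness, kurtosis


# Hypothetical daily counts, for illustration only (not the study data).
toy_deaths = [1, 2, 2, 3, 3, 3, 4, 5, 8, 15]
mean, median, skewness, kurtosis = shape_summary(toy_deaths)
print(mean > median, skewness > 0)
```

A mean above the median together with positive skewness points to a long right tail, which is the pattern the paper reports for the observed deaths.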

Best-fit distribution for the data
COVID-19 deaths are count data recorded over a time interval. In this paper, the distribution of the number of COVID-19 deaths is modeled with the generalized gamma distribution (GGD) to obtain the best-fit distribution, which is important in the design of the CUSUM control chart. The mortality distribution modeled here will reflect mortality from the different age brackets of infected cases in the different states and regions. The model can approximate the dynamics of COVID-19 and reveal important patterns in the data. The generalized gamma distribution, which encompasses the exponential, gamma, and Weibull distributions, is a flexible family with a variety of shapes and hazard functions, which makes it suitable for determining which parametric model is appropriate for the COVID-19 data. For convenience, the number of deaths due to COVID-19 is modeled with the gamma family: within the observed range of the data, we consider whether a discretized continuous approximation fits, since the data aggregate deaths over regions in which sparse or no data on COVID-19 deaths are recorded.
To fit the best distribution to the data, we used the fitdist function in the fitdistrplus package [30] for the R language to fit the distributions in the GGD family to the number of COVID-19 deaths. The parameters of the four distributions were estimated by maximum likelihood estimation (MLE) to obtain the model that best describes the data. Table 1 gives the goodness-of-fit statistics and criteria for the four distributions.
Using the exponential distribution model, the parameters of the distribution were determined to be 0.13 with a standard error of 0.011 for the rate parameter. The log-likelihood, Akaike Information Criterion (AIC), and Bayesian Information Criterion (BIC) for the model are -426.64, 855.28, and 858.24, respectively. For the gamma distribution model, the parameters of the distribution were determined to be 1.97 with a standard error of 0.217 for the shape parameter and 0.26 with a standard error of 0.03 for the rate parameter. The log-likelihood, AIC, and BIC for the model are -410.87, 825.74, and 831.65, respectively. Furthermore, using the Weibull distribution model, the parameters of the distribution were determined to be 1.48 with a standard error of 0.096 for the shape parameter and 8.23 with a standard error of 0.492 for the scale parameter. The log-likelihood, AIC, and BIC for the model are -411.48, 826.96, and 832.87, respectively. Finally, using the Lognormal distribution model, the parameters of the distribution were determined to be 1.72 with a standard error of 0.07 for the scale parameter and 0.80 with a standard error of 0.048 for the shape parameter. The log-likelihood, AIC, and BIC for the model are -415.61, 835.23, and 841.14 respectively.
From the above results, the gamma distribution has the highest log-likelihood (-410.87) and the lowest AIC (825.74) and BIC (831.65) among the four candidates, so it best describes the distribution of the number of deaths due to COVID-19 in this study. Therefore, we developed a Gamma-CUSUM control chart to investigate the COVID-19 deaths in Nigeria.
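The model comparison can be reproduced from the reported log-likelihoods alone. The sketch below, in Python, uses the standard definition AIC = 2k − 2 log L, where k is the number of estimated parameters (1 for the exponential, 2 for the others). BIC is omitted because it also needs the sample size, which is not stated in the text.

```python
# Reported maximized log-likelihoods and number of estimated parameters per model.
fits = {
    "exponential": (-426.64, 1),
    "gamma": (-410.87, 2),
    "Weibull": (-411.48, 2),
    "lognormal": (-415.61, 2),
}


def aic(loglik, k):
    """Akaike Information Criterion: smaller values indicate a better trade-off."""
    return 2 * k - 2 * loglik


aic_values = {name: aic(ll, k) for name, (ll, k) in fits.items()}
best = min(aic_values, key=aic_values.get)

for name, value in sorted(aic_values.items(), key=lambda item: item[1]):
    print(f"{name:12s} AIC = {value:.2f}")
print("lowest AIC:", best)
```

The computed values agree with the reported AIC figures up to rounding of the log-likelihoods, and the gamma model ranks first.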

Design of generalized Gamma-CUSUM control chart
The proposed CUSUM chart for monitoring the number of COVID-19 deaths is designed based on the gamma distribution, which is the best-fit distribution for the number of COVID-19 deaths in Nigeria. We monitor upward shifts in the number of COVID-19 deaths and propose a one-sided Gamma-CUSUM control chart. The upper Gamma-CUSUM statistic is defined by
$$C_n^{+} = \max\left(0,\; C_{n-1}^{+} + X_n - K\right), \quad n = 1, 2, \ldots, \tag{2}$$
where $X_n$ represents the number of deaths, which follows the gamma distribution, $C_n^{+}$ is the Gamma-CUSUM score for case $n$, $K$ is the reference value, and $C_0^{+}$ is the non-negative head-start, given as $C_0^{+} = \theta_0$. When $C_n^{+} > H$, the chart signals an out-of-control condition, indicating that the number of COVID-19 deaths has exceeded the control limit. In designing the Gamma-CUSUM chart, the choices of $K$ and $H$ are fundamental. The reference value $K$ is obtained from the log-likelihood ratio as (cf. [31])
$$K = \frac{\gamma\,\theta_0\,\theta_1\,\ln(\theta_1/\theta_0)}{\theta_1 - \theta_0}, \tag{3}$$
where $\gamma$ is a fixed shape parameter, $\theta_0$ is the in-control scale parameter if known (or $\hat{\theta}_0$ if it is estimated), and $\theta_1$ is the out-of-control scale parameter, and
$$H = h\,\theta_0$$
is the decision limit, in which the threshold $h$ depends on the shape parameter $\gamma$ and the ratio $\theta_1/\theta_0$ and is chosen to give a pre-specified in-control average run length (IC ARL). The ARL is the expected number of samples until the chart signals an abnormal condition, and it is one of the most commonly used measures for evaluating the performance of a control chart. A good control chart is expected to have a large ARL for an in-control process. The ARL performance of control charts for non-normal distributions has been studied by many researchers, including Vardeman and Ray [32], who derived the exact ARL for the exponential distribution. Acosta-Mejia et al. [33] assessed the performance of a CUSUM chart using the ARL under the chi-square distribution. The theoretical analysis of the ARL of the Gamma-CUSUM chart was carried out by Huang et al. [31], who evaluated the run-length distribution of the CUSUM chart for monitoring changes in the scale parameter under the gamma distribution based on the piecewise collocation method.
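The one-sided scheme described above can be sketched in Python. This is a minimal illustration, assuming the usual upper CUSUM recursion, in which each score is the previous score plus the new observation minus the reference value, floored at zero, and a signal is raised at the first score above the decision limit. The function name, the head-start default, and the toy inputs are hypothetical and are not the study's data or software.

```python
def upper_cusum(xs, k, h, head_start=0.0):
    """Upper CUSUM scores and the 1-based index of the first signal (None if no signal).

    Assumed recursion: c_n = max(0, c_{n-1} + x_n - k), with a signal when c_n > h.
    """
    c = head_start
    scores = []
    first_signal = None
    for n, x in enumerate(xs, start=1):
        c = max(0.0, c + x - k)
        scores.append(c)
        if first_signal is None and c > h:
            first_signal = n
    return scores, first_signal


# Hypothetical toy series with an upward shift from the fourth observation.
toy = [2, 3, 1, 6, 7, 8, 2]
scores, first_signal = upper_cusum(toy, k=4.0, h=5.0)
print(scores)
print(first_signal)
```

Observations below the reference value pull the score back toward zero, so only a sustained run above the reference value accumulates enough to cross the decision limit.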

Application to COVID-19 data
For monitoring the number of COVID-19 deaths in Nigeria, the CUSUM chart with the best-fit distribution is applied. Although the CUSUM chart is usually based on the assumption that the quality characteristic follows the normal distribution, this study bases the chart on the gamma distribution. Using the daily reported number of COVID-19 deaths within the period of study, the shape parameter of the gamma distribution was estimated to be 1.9669 and the rate parameter 0.2650. Thus, the in-control scale parameter, the reciprocal of the rate, is estimated as $\theta_0 = 3.77365$. Suppose our objective is to detect a 25% increase in the number of COVID-19 deaths (increases smaller than 25% could also be considered); then the new scale parameter is $\theta_1 = 4.71706$, so that $\theta_1/\theta_0 = 1.25$. Therefore, the reference value using Eq (3) is K = 8.281377.
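The design constants can be checked with a few lines of Python. This sketch assumes the reference value has the log-likelihood-ratio form K = γ θ0 θ1 ln(θ1/θ0) / (θ1 − θ0) and that the decision limit is the threshold h = 13.2 (quoted in this section for an IC ARL of 370) multiplied by θ0. Both forms are inferred from the reported numbers rather than copied from the source, and the variable names are illustrative.

```python
import math

shape = 1.9669       # reported gamma shape estimate
theta0 = 3.77365     # reported in-control scale estimate
theta1 = 1.25 * theta0  # scale after a 25% increase
h = 13.2             # reported threshold for an in-control ARL of 370

# Assumed log-likelihood-ratio form of the reference value.
K = shape * theta0 * theta1 * math.log(theta1 / theta0) / (theta1 - theta0)

# Assumed scaling of the threshold by the in-control scale.
H = h * theta0

print(f"theta1 = {theta1:.5f}")
print(f"K = {K:.4f}")  # close to the reported 8.281377
print(f"H = {H:.2f}")  # close to the reported 49.81
```

Small differences in the later decimals of K are expected, because the reported shape and scale estimates are rounded.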
The approximate threshold $h$ for an IC ARL of 370, obtained from [31] for a shape parameter $\gamma \approx 2$, is 13.2. Hence, the decision limit for the Gamma-CUSUM chart is $H = h\theta_0 = 13.2 \times 3.77365 \approx 49.81$. Therefore, the CUSUM statistic $C_n^{+}$ signals an out-of-control condition at the first $n$ for which $C_n^{+} > 49.81$. The summary of the upper Gamma-CUSUM statistic $C_n^{+}$ for the number of COVID-19 deaths is presented in Table 2. A plot of the statistic in Table 2 is presented in Fig 3. From Fig 3, it can be observed that the control chart first signals an out-of-control condition on day 68. The figure also reveals that the statistic rose consistently from day 68 to day 128, followed by a gradual decline for the remaining days. This study has established that the number of COVID-19 deaths in Nigeria remained out of control for the next 60 days. Hence, the reasons for the consistent rise in the number of COVID-19 deaths within this period in Nigeria and the gradual decline thereafter require further investigation. The period 17th June to 15th August falls within the rainy season, when the weather is very cold. Though there is no scientific evidence of a correlation between the number of infections/deaths and weather conditions, further research can be considered in temperate regions to verify this observation.
Furthermore, the performance of the standard normal-CUSUM control chart is evaluated, assuming that the COVID-19 data follow the normal distribution for the period of study, and is compared with the proposed Gamma-CUSUM chart. A plot of the standard normal-CUSUM control chart statistics in Table 3 is presented in Fig 4. A comparison of the plots in Figs 3 and 4 reveals that the Gamma-CUSUM chart detects an out-of-control signal on day 68, whereas the standard normal-CUSUM chart does not signal an out-of-control condition. This establishes the adequacy and efficiency of the Gamma-CUSUM chart for monitoring the number of COVID-19 deaths.

PLOS ONE

Conclusion
In this study, the reported number of COVID-19 deaths was modeled using the generalized gamma family of distributions. The best-fit distribution was found to be the gamma distribution for the number of COVID-19 deaths extracted from the NCDC website for the period 11th April to 7th September 2020. Thereafter, a Gamma-CUSUM control chart was developed for monitoring the number of COVID-19 deaths in Nigeria. The results show that the number of COVID-19 deaths rose consistently from day 68 to day 128, indicating that the number of COVID-19 deaths within the period was beyond the control limit and requires further investigation to ascertain the cause. Also, the performance of the Gamma-CUSUM chart was compared with the standard normal-CUSUM chart, and the results revealed the superiority of the Gamma-CUSUM chart. The results show that distributional knowledge of infectious diseases is essential for efficient monitoring of infections and the deaths arising from them. The proposed study can be extended to neutrosophic statistics as future research when the data come from an uncertain environment. Similarly, a CUSUM chart for the generalized gamma distribution to monitor the time between COVID-19 deaths is another direction worth investigating.
Supporting information S1 Table.